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ABSTRACT 

Temporal analysis of radiation from Astrophysical sources like Active Galactic Nuclei, 
X-ray Binaries and Gamma-ray bursts provide information on the geometry and sizes 
of the emitting regions. Robustly establishing that two light-curves in different energy 
bands are correlated and measuring the phase and time-lag between them is an im- 
portant and frequently used temporal diagnostic. Analytical expressions to estimate 
the errors on the cross-correlation, phase and time-lag between two light-curves are 
presented. Earlier estimates depended upon numerically expensive simulations or on 
dividing the light-curves in large number of segments to find the variance. Thus, the 
analytical estimates presented here allow for analysis of light-curves with relatively 
small 1000) number of points, as well as to obtain information on the longest 
time-scales available. The error estimation is verified using simulations of light-curves 
derived from both white and 1/f stochastic processes with measurement errors. As a 
demonstration, we apply this technique to the XMM-Newton light-curves of the Active 
Galactic Nucleus, Akn 564. 

Key words: accretion, accretion discs - X-rays:binaries -methodsianalytical - galax- 
ies:active - galaxies individual: Akn 564 



1 INTRODUCTION 

Establishing that two light-curves, measured in different en- 
ergy bands, are correlated with each other is an impor- 
tant temporal diagnostic for various kinds of Astrophysical 
sources, especially for Active Galactic Nuclei (AGN) and X- 
ray binaries. The detection and measurement of the level 
of correlation constrains the number of radiative processes 
active in the source and can be used to validate (or rule 
out) models based on spectral analysis. Phase and time- 
lags detected for correlated light-curves can provide further 
insight into the geometry and size of the emitting region. 
Often in these applications, the light curves available for 
analysis are of short duration and have measurement errors. 
The true temporal behaviour of a source can only be es- 
tablished if there are robust estimates of the errors on the 
cross-correlation, phase and time-lags. 

It is important to emphasize that a cross-correlation 
analysis between two finite length light-curves will not pro- 
vide an accurate measure of the correlation between them, 
even in the absence of measurement errors. Intrinsic stochas- 
tic fluctuations in the light curves will induce an error on the 
cross-correlation measured. An estimate of the significance 



and error of the cross-correlation detected, should take into 
account both, measurement errors as well as statistical fiuc- 
tuations. 

The standard method to estimate the error on the cross- 
correlation involves dividing the light curves into several 
equal segments and finding the cross-correlation for each. 
Then the net cross-correlation is given by the average of the 
different segments and the variance is quoted as an error. 
For example, this technique is implemented by the function 
"crosscor" of the high energy astrophysics software HEA- 
The method is reliable only if the light curves can 
be divided into a large number of segments (>> 10) and each 
segment is sufficiently long and not dominated by measure- 
ment errors. The temporal behaviour of many astrophysical 
systems depends on the time-scales of the analysis and hence 
by using this method, one loses information on the behaviour 
of the system on time-scales comparable to the length of the 
original data. In AGN, the time scale involved is long compa- 
rable to the length of observation in many cases, hence it is 
not practical to divide the light curve in segments. Moreover, 
there does not seem to be any established way by which this 
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method can be extended to get an estimate of the time-lag 
between the Ught curves and its error. 

These deficiencies can be overcome by using a Monte 
Carlo technique where one simulates a large number of pairs 
of light curves having the same assumed temporal proper- 
ties and with the same measurement errors as the original 
pair. The results of the original pair can be compared with 
the simulated ones to ascertain the confidence level of the 
cross-correlation and time-lag. The simulated light curves 
should take into account the stochastic fluctuations of the 
light curves and not just the measurements errors. Indeed, 
when the light curve is sampled unevenly and with mea- 
surement errors changing in time, the Monte Carlo tech- 
niq ue may be the only w ay to obtain reliable estimates (for 
e.g. IPeterson et al]|l998h . Monte Carlo technique is numer- 
ically expensive and hence are not practical for analysis of 
a large sets of data. More importantly, the results depend 
on the subjectivity of the assumed temporal properties of 
the system. For example, to ascertain the errors on an ob- 
served cross-correlation and time-lag value, the simulations 
are generally done with the assumption that these are the 
true intrinsic values. Similar assumptions have to be made 

on the shape of the power spectra of the light curves^ 

As pointed out and discussed extensively by IWelshI 

l|l999t ). an analytical estimate of the variance on cross- 
correlation is not straight forward. In the literature, there 
is an analytical esti mate for the cr oss-correlation known as 
Bartlett's equation (|Bartlettlll955l ) which is not often used 
in Astronomical contexts. This method is available in the 
"crosscorrelation" function in the IMSL numerical librariefl 
The error is accurate only when the complete knowledge 
of the cross-correlation and auto-correlation functions are 
available. Its effectiveness for short duration light curves is 
uncertain. Moreover, this error estimate does not naturally 
translate into error estimates for the phase and time lag 
between the light curves. 

Complete information regarding the temporal relation 
between two light curves can be obtained by computing the 
coherence and time-lag as a function of Fourier frequency. 
A detailed description o f the technique as w ell as physical 
interpretation is given bv lNowak et al.l l|l999l ). The two light 
curves are divided into many segments and for each segment 
a Fourier transform is undertaken and coherence and phase 
lag as a function of frequency is estimated. For the different 
segments, the coherence and phase lags are averaged and 
their errors can be estimated analytically. Such detailed in- 
formation can only be obtained for long light curves which 
can be split into several segments. In the absence of such 
rich data, statistically significant results can be obtained by 
averaging over Fourier frequencies. Indeed, from this view 
point the cross-correlation, is in some sense, the average of 
the coherence over all frequencies. However, computing the 
error on the cross-correlation using the error estimates for 
the coherence is not straight forward. First, the averaging 
has to be appropriately weighted by the power in each fre- 
quency bin. Secondly, the error estimate for the coherence 
is reliable only if the error itself is small, which is the case 
when many segments are averaged and not necessarily true 



for the coherence at a single frequency bin obtained from a 
single segment. 

In this work, we present an analytical estimate for the 
cross-correlation between two light curves. The error esti- 
mate is based on the Fourier transforms of the light curves 
and is based on the correct averaging over different fre- 
quency modes. In §2, the estimate is derived and verified 
by simulations with and without measurement errors. §3 
highlights the difficulties in estimating a time-lag and its 
error using the standard method of finding the peak of the 
cross-correlation function. The cross-Correlation phasor is 
introduced in §4 which leads to an estimate of the phase lag 
between the light curves. In the same section, a technique 
is introduced by which one can measure the time lag and 
its error. In this method the time-lag measured can be even 
smaller than the sampling time bin of the light curves. The 
complete fully self contained algorithm is presented in §5 
for easy reference. As an example, in §6, the technique is 
applied to the XMM-Newton light curves of the highly vari- 
able and well studied AGN, Akn 564. In §7, the summary 
and discussion includes a list of important assumptions on 
which the technique is based and provides examples when 
the assumptions may not be valid. 



2 ANALYTICAL ERROR ESTIMATE OF 
CROSS-CORRELATION 

2.1 Light curves without measurement errors 

We first consider an idealised case of two light curves without 
measurement errors. The two light curves are assumed to be 
recorded in discrete equally spaced time intervals, /S.t and 
the mean is subtracted from each of them. It is assumed that 
they are partially linearly dependent such that, 

= X, (1) 
Yj = z,+Axj (2) 

where Xj and Zj are time-series produced by two indepen- 
dent stochastic processes. Each time series can be conve- 
niently represented by its discrete Fourier transform, X^, 
defined as 

JV-l 

Xfe = ^ Xj exp {2Tvijk/N) (3) 

and a power estimate Pxk = 2\Xk\'^ can be obtained. For a 
stationary system, the ensemble average (i.e. average of an 
infinite number of realisations) of the power, < Pxk >, is 
a characteristic of the stochastic process. A power derived 
from a single time series, Pxk is only an estimator of its 
value. In particular the real and imaginary parts of X^ vary 
independently can be derived from two ind ependent Gaus- 
sian distributions (|Timmer fc Koeniel[l995h . The deviation 
of Pxk from < Pxk > is roughly equal to < P^k > i.e. the 
power estimate from a single light curve has nearly 100% 
error. The variance ax = ^ Pxk is again an estimate of the 
ensemble averaged variance < ct\ >= X/ ^ ^^k >■ 

One can define the cross-correlation estimate of the two 
time series as 
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where 



CXY 



N/2-1 



test the resuhs obtained from simulations. The transforma- 
tion chosen for the analysis is 
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Here and Yk are Discrete Fourier transforms of Xj and 
Yj respectively and 
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where the subtraction of c^y in the denominator may make 
it nearly independent of the numerator. The average devia- 
tion of P can be estimated to be, 

2 < cxY > Acxy 



AP : 



(12) 



(6) 



where the variation of the denominator has been neglected. 
AP is related to ACxy by 

dP . . „ 2CxY 



Their ensemble averages are < cxy >= A < >, 
< ax >=< o-^ > and < cry >=< > +A'^ < al >. 
Cxy has the useful property that its ensemble average 
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Using Eqns ([12]) and (|T3 
deviation of Cxy to be 
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one can obtain an estimate for the 
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^< ul > (< 0-1 > -f < cr| >) 



(7) 



ACj, 



1- < Cxy 
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(14) 



is zero if the two light series are uncorrelated (i.e. A = 0) 
and unity if they are completely correlated (i.e. when < 
>= 0). However, Cxy is only a measure of < Cxy > 
and the deviation between them needs to be quantified. 

The average deviation of cxy from < cxy > can be 
estimated to be (Appendix A: Egn lAGI) 



(Acxi 



Naturally, ACxy depends on ensemble averaged quantities 
which characterise the stochastic processes that have pro- 
duced the light curves. Typically, one does not have a priori 
information of the stochastic processes and the ensemble av- 
eraged quantities need to be estimated from the light curves. 
Thus the best estimate of ACxy can be obtained by replac- 
ing these ensemble averaged quantities with the measured 
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If one does not have a priori information about the stochas- 
tic process then < \Xk\'^ > and < \Yk\'^ > have to be esti- 
mated using the measured values. 
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(15) 
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(Acxy)' ~ (AcW)' = ^ J2 



(9) 



k=-N/2 



To ascertain whether there is a detectable correlation 
between the two light-curves (i.e. |Cxy| > 0) it is first nec- 
essary to show that, at some confidence level, \cxy\ > 0. One 
can define a null hypothesis sigma level aNH ~ \cxy\/ Ac'^y 
and fix a criterion (a prudent one being ctjvh > 3) to as- 
certain whether any correlation has been detected. It is im- 
portant to note that only if the criterion is satisfied should 
one proceed to estimate the degree of cross-correlation Cxy 
otherwise any such attempt will not only be incorrect but 
also meaningless. 

If Cxy is uncorrelated with -^(r^o-y, then 



For practical situations AC^y can be used as an estimate 
for the error on Cxy- 



2.2 Comparison with results from simulations 

We generated 200 indepe ndent light-curve s usin g 
the method d escrib ed by iTimmer fc Koeni3 l|l995h . 
IVaughan etHI l|2003h discuss the different methods to 
generate stochastic light curve s and give arguments fo r 
favouring the one prescribed bv lTimmer fc Koenid ()l995h . 
The intrinsic power spectrum of the stochastic process 
was assumed to be a power-law i.e. P(/) oc /~". The 
light-curves were generated of length 8A^ and rebinned to a 
length of A'^, to avoid aliasing effects. From these 200 light 
curves, 19900 pairs of the light curves were generated which 
obey. 



ACj, 



AcxY 



(10) 



Y, 



2.7 + AXi 



(16) 
(17) 



where the variation in yV^a£has been neg lected. However, ^^gre Xj and Zj are two different simulated light-curves, 
as discussed extensively by|w3([l99i), (jy is, in gen- 
eral, correlated with cxy- In particular, Acxy depends on 
the variation of cr^ through the term Aa\. 

A possible solution is to define a transformation, 
P{Cxy) whose terms are not correlated (or at least not so 
correlated). Then estimate the expected variation for that 
function, AP and use that to obtain an estimate for ACxy- 
Below we describe such a transformation and subsequently 



The cross-correlation, Cxy was computed for each pair. For 
TV = 1024, a = and for three different values of ^ = 0, 1 
and 5, the histograms of the Cxy are plotted in the top 
panel of Figure 1. These histograms, Hj axe normalised such 
that their summation '^HjS = 1 where S is the bin size. 
They are compared with a normalised Gaussian distribution 
with a centroid value equal to the expected averaged cross- 
correlation of 
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Figure 1. Comparison of simulation with analytical results. 
19900 pairs of light-curves of length N = 1024 were created 
for Xj = Xj and Yj = Zj + Axj. xj and Zj are independent 
time-series generated from a stochastic white noise process (i.e. 
power spectrum index « = 0). Top Panel compares the nor- 
malised histogram of CxY with a Gaussian with centroid at the 
expected < CxY > and width given by cr = ACxy (Eqn[l4}. 
Bottom panel shows the histogram of the cross-correlation vari- 
ation {CxY- < CxY >)/AC'^y. If AC'^Y (Eqn[l5ll, which is 
estimated from the pair of light-curves, is a true measure of the 
variation, then the distribution should be a zero centred Gaussian 
with (T = 1 (solid line). 



< CxY > = 



A 



(18) 



and with width a equal to ACxy computed using Eqn (I14p . 
As can be seen, the normalised Gaussian distributions de- 
scribe well the simulated results which validates the method 
and assumptions used to estimate AC'xy in the previous 
subsection. However, in practical situations one has to use 
the approximation AC'xy to estimate the variance which in 
general will vary for each pair of Ught curves. In the bottom 
panel of Figure 1, we plot histograms of deviation of Cxy 
from the average < Cxy > normalised by the estimated de- 
viation AC'xy i-e- (Cxy- < Cxy >)/ AC'xy- H ^C'xy is 
an accurate measure of the variation of Cxy then the distri- 
bution of the normalised variation should be a zero centred 
Gaussian with a — 1. The plot verifies this prediction by 
comparing the distribution with such a Gaussian shape. The 
distributions agree well with each other except for large A, 
where the Gaussian distribution is slightly broader than the 
simulation results. This implies that the AC'xy is a slight 
overestimation of the true deviation for large values of A. 

Figure 2 shows the same comparison as Figure 1, but for 
the case when the power-law index a — 1. Qualitatively the 
comparison between the expected and obtained distribution 
are similar to the a — case, except for some quantitative 
differences. For the same length of the fight curves, ACxy is 
larger for a — 1. Since Cxy is by definition constrained to be 



Figure 2. Same as Figure 1, except that Xj and Zj are indepen- 
dent time-series generated from a stochastic 1// noise process 
(i.e. power spectrum index a = 1). The width of the distributions 
are broader than for white noise. 



less than unity, the distribution differs from the symmetric 
Gaussian shape for large A. The bottom panel shows that 
AC'xy is a better representation of the variation than it was 
for a = 0. 

For white noise (i.e. a = 0), the dependence of ACxy 
on the length of the light-curves is oc l/\/7V, while for 
a — 1 the dependence is weaker ~ l./logN for large A'^. 
The original light-curves may be divided into M parts, and 
cross-correlations of each may be averaged. For a = this 
will not lead to any change in the accuracy with the fi- 
nal ACxy being nearly the same. However, for q = 1, 
ACxy oc l/(\/Mlog(A''/Af)) which would give a much bet- 
ter accuracy than finding the cross-correlation for the whole 
light-curve. However, such a cross-correlation will not have 
information about the behaviour of the system on timescales 
corresponding to duration of the original light curve. 

For simplicity, the simulations undertaken here are for 
the case when both the uncorrelated light curves Xj and Zj 
arise from stochastic processes having the same power spec- 
tral shape i.e. P{f) oc f~°'. However, the results obtained 
are general and are also true when the spectral shapes are 
different. 



2.3 Light curves with measurement errors 

We next consider a more realistic case, where the light- 
curves have measurement errors. In particular. 



xj + exj 

Zj 4- Axj + eYj 



(19) 



where Xj and Zj are time-series produced by two indepen- 
dent stochastic processes as before and exj and eYj are the 
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known measurement errors for measuring Xj and Yj respec- 
tively. The cross-correlation is now defined as 

CXY 



Cx5 



(20) 



where cxy is the same as before (Eqn[5]) and axE and (tye 
are the rms variation of the measured errors i.e. 
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(21) 




and similarly for uye- 

The expressions for < cxy > and Acxy remain the 
same as for the measurement error free case discussed pre- 
viously and following the same procedure as before, one can 
estimate 

< Cxy >'^)Acxy 
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analogous to Eqn (|14p . To this error estimate we have to 
add the fluctuations of axE and ayE around their ensemble 
averaged values < ax e > and < cry e > ■ Note that it is these 
ensemble averaged values < axE > and < ayE > that are 
known a prion and not axE and (Jye- If the measurement 
errors are Gaussian white noise (as is generally the case) 
then, Aax YE = i'^/y^)<^x,YE (see Appendix: Eqn lASp . 
Moreover since the fluctuations are independent of the true 
signal, they can be added to ACxyi using standard error 
propagation techniques. Thus 
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Figure 3. Comparison of simulation with analytical results in 
the presence of measurement errors. 19900 pairs of light-curves of 
length N = 1024 were created for Xj = Xj 



- eyj ■ Xj and Zj are independent time-series generated from 
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< <y\E > /^ 
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< gIe > /^ 



Axj 

a stochastic white noise process (i.e. power spectrum index o = 0) 
The measurement errors were simulated from a Gaussian distri- 
bution such that < cr^,^ >/< cr^ > = < (t|,^ >/< (t^ > = 0.5. 
Top Panel compares normalised histogram Cxy with a Gaus- 
<^ (j2 -^y sian with centroid at the expected < CxY > and width given by 
(7 = ACxy (Eon [23l l. Bottom panel shows the histogram of the 
cross-correlation variation (CxY— < C'xY >) / ^Cj^y - If AC^y 
fEgn [24t . which is estimated from the pair of light-curves, is a 
/Tyjj true measure of the variation, then the distribution should be a 

zero centred Gaussian with a = 1 (solid line). Note that in the 
presence of measurement errors < Cxy > can be greater than 
one. 



> 
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ACxY is in terms of ensemble averaged quantities which 
have to be estimated using the light curves. Hence 



(l-Ciy) 
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and 



Cx5 



Y = 




(23) with < ajcE >/< al > = < oye >/< c^l > = 0.5. Figure 3 
and 4 show the comparison of the distribution with the ex- 
pected Gaussian distribution for power spectral index a = Q 
and 1 respectively. As expected the distribution of Cxy is 
broader in the presence of measurement errors. Note that in 
this case Cxy can be greater than unity. The figures illus- 
trate that the variance estimations reasonably describe the 
simulated distributions. 

(24) 

The above results are for the case when the measure- 
ment errors are Gaussian distributions. We have verified that 
even when the mean counts per time bin is ~ 10, simi- 
lar results are obtained when the measurement errors are 
due to Poisson fluctuations. In order to correctly propagate 
the error and obtain Eqn (|23|) . it is implicitly assumed that 



(25) 

is the estimation of the variation in the presence of measure- 
ment errors. 

To validate the above results we generated 19900 pairs 
of the light-curves which obeyed Eqn (|19|l . The measure- 
ment errors were generated from a Gaussian distribution 



AcTx ^^; — {2/ ^1^)g\ ye « (^x.Y — '^x,YE- Note that 
these are also the criteria that any significant variability 
has been detected in each of the two light curves. In other 
words if the criterion is not satisfied for one of the light 
curves, this implies that there is no significantly excess vari- 
ance than expected from the measurement errors and hence 
a cross-correlation analysis cannot be undertaken. 

It is assumed that the measurement errors are random 
Gaussian fluctuations for which Agx^ye = 

(l/^)ai, 

YE- 
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Figure 4. Same as Figure 3, except that Xj and Zj are indepen- 
dent time-series generated from a stochastic l/f noise process 
(i.e. power spectrum index a = 1). 



If that is not the case (say for example, if the measurement 
errors arise from a 1// type fluctuations) then the appropri- 
ate value of Aax^YE fEan. ET|l may be used and Eqn (|25p be 
appropriately modified. However, if the measurement errors 
have unknown systematic variations, they naturally cannot 
be accounted for. 



Figure 5. Significance and cross-correlation function for two sim- 
ulated light-curves with measurement errors. Two pairs of light- 



curves of length N 
and Yi = Zi + Axi 



1024 were created for Xj 



+ eYj. 



and Zj are independent time- 



series generated from a stochastic l/f noise process (i.e. power 
spectrum index a = 1). The measurement errors were simu- 
lated from a Gaussian distribution such that < u^e "^^ ~ 
< >/< (7^ > = 0.5. The top panel shows the significance 
<^NH = {f^xyl/ ^c'^y as a function of r. Note that (Tjvh < 3 for 
all T except when |t| ~ 0. The bottom panel shows the cross- 
correlation function, CxviT) (Thick line), CxY {t) + ^C'^y ™d 
CxYir) - ^C'xY (Thin lines). 



3 THE CROSS-CORRELATION FUNCTION 

In general two light-curves may be linearly related to each 
other with a time lag r. To investigate such possibilities, 
one can calculate the cross-correlation function, Cxy{t) be- 
tween a light curve Xj and the Vj-fr- For light-curves of 
length A'^, there will be only N' (t) = N — t overlapping 
terms. Cxy{t) can be computed based on these N'{t) terms 
and then the definitions, error analysis of the previous sec- 
tions follow through without modifications. Such a definition 
of Cxy{t) has been c alled lo c ally d efined cross-correlation 
function (LDCCF) by IWelshI (|l999l ) and is different from 
the standard one. In the standard definition the length of 
the original light-curves A'^ is preserved either by padding 
the unknown part of the light with zeros (for the time do- 
main computation) or by repeating the series (in the Fourier 
domain computation). Here we consider only LDCCF for 
which the analysis mentioned in the earlier section holds. 

For every time lag, r, ctjvh — \cxy\/Ac'j.y needs to be 
computed to ascertain whether there is any detectable corre- 
lation. In Fig 5, aNH is plotted against r for two light curves 
generated using l/f stochastic process, with measurement 
errors and with A — 1. In fact, the two light-curves are the 
first pair of light-curves used in the simulation described in 
Fig 4. Since Cxy is being computed for a large number of 
time lags t, (although they are not independent, see below), 
it is prudent to keep a conservative criterion for correlation 
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Figure 6. The blown up portion of Fig 5 near t = shown for 
clarity. 
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detection, ctjvh > 3. It can be seen from the figure that 
lyNH < 3 for all values of r except when r ~ 0. As seen 
in the bottom panel of Fig 5, there are couple of peaks in 
Cxy{t) but none of them (except near r = 0) are signifi- 
cant. 

The Cxy{t) are not in general independent of each 
other. This is clearly seen as an example in Fig 6 where 
the region near t = has been expanded for clarity. For 
|t| < 3, crjvH > 3 and the dispersion of C'xy{t) is much less 
than expected from the error bars AC'xv This shows that 
it is incorrect to rebin Cxy{t) in r space. Instead if there is 
a justification to rebin in time, then the original light-curves 
should be rebinned and not C'xy{t). The data clearly shows 
a peak of Cxy{t) at r = indicating a time-lag consistent 
with zero. However, it is difficult to justify any error mea- 
surement on this time lag. In practise, one can represent the 
peak of the function as a Gaussian and take its width as an 
representative error for the lag. However, there are several 
un-attractive features of this technique. First, the width of 
Cxy{t) represents the auto-correlation of the coherent sig- 
nal rather than any error on the measured time-lag. Sec- 
ond, since the errors on Cxy{t) are correlated a formal fit 
is not allowed. Finally, the Gaussian fit will in general de- 
pend on the number of points used to represent the "peak" 
of Cxy{t). a justifiable technique to compute the error on 
the time-lag is required. 



4 THE CROSS-CORRELATION PHASOR 

The cross-correlation phasor can be defined as 



Cx5 



where 



CXY 



JV/2-1 



^^^ = ]v^ ^^^^ 



(26) 



(27) 



where the difference between the phasor and the cross- 
correlation is that for the phasor the summation is only over 
positive frequencies. They are related as Cxy ~ Re{CxY)- 
For partially correlated light curves with no phase lag, 
the ensemble average < Im{C'xY) >~ and the cross- 
correlation is given by < Re{CxY) >~< Cxy >■ By defini- 
tion, the deviation of Re{cxY) is the same as for cxy and 
is given by Eqn ((HJ i.e. ARe{cxY) = Acxy- The deviation 
of Im{cxY), is only due to the incoherent parts of the light 
curves and hence 



AIm{cxY) = AcxY\ l- ICxyP 



(28) 



If there is an intrinsic phase difference, < 4> > between 
the two light curves, then the ensemble average of Cxy will 
be a complex quantity given by < \Cxy\ > e'^**^. The 
deviation of A|Cxy| from < jCxyj > can be estimated 
by Eqn (|25p except that Cxy is to be replaced by ICxyl. 
The phase difference between the two light curves can be 
estimated as 



. Im{cxY) 
sin0 = — p-^ — 

\cxy\ 

whose error can be estimated to be 



(29) 




-2 2 

(|CJ - <|CJ>)/A|CJ 



-2 2 

(0- <0>)/A0 



Figure 7. Comparison of simulation witli analytical results. 
19900 pairs of light-curves of length N = 1024 were created for 



Zj are independent 



time-series generated from a stochastic 1// noise process (i.e. 
power spectrum index a = 1). The values of the parameters used 
for the simulations are A = 1 and <</>>= 1. Solid lines are the 
expected distribution for the error estimates A\Cxy\ and Aip. 
A\C'j^y\ given by Eqn II25I I except that Cxy is replaced by 
I Cxy I while Aip is given by Eqn I I30I I 



A0 = 



AcxY 

\cxy\ 



(30) 



To validate the above results we simulated the same set 
of light curves used for Figure (|4| i.e. using 19900 pairs of 
light curves with measurement errors and generated from 
a stochastic 1/f noise process. We introduced a phase dif- 
ference of = 1.0 between the coherent parts of the light 
curves. The histograms of jCxyl and (and their devia- 
tions) are plotted against the expected estimates in Figure 
7. 

If the coherent parts of the light curves have a time-lag, 
r, between them, then the cross-correlation phasor will have 
a non-zero phase. One can constrain the time-lag by shifting 
one of the light curves in time till the cross-correlation phase, 
(j) = O.ln other words, by defining a cross-correlation phasor 
function, 



CxY{r) = 



N/2-1 

E 

fc=i 



^ikr/N 



(31) 



one can obtain r such the phase of CxY{r), (f^ij) — 0. The 
error on r can be estimated by considering the range of r for 
which (^(r') ± A(j> is consistent with zero. Note that r need 
not be an integer and hence time-lags less than the time 
resolution of the light curves can be ascertained for good 
quality data. 

The above analysis is valid only when there is a de- 
tected correlation between the two light curves. To ascer- 
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tain whether there is a correlation (with phase lag) between 
the two, one needs to consider both the real and imaginary 
parts of cxY and compare with Acxy- While one can com- 
pute the joint probabilities, a more prudent and simpler ap- 
proach is to demand that a correlation is detected only if 
\cxy\/ Acxy > 3. If the condition is not satisfied one can put 
an upper limit on the correlation as SAc^y/ \l (j\a\. 



5 ALGORITHM TO COMPUTE 

CROSS-CORRELATION, PHASE AND TIME 
LAGS 

Based on the results of the earlier sections, we present here 
a complete self-contained description of the algorithm to 
compute and estimate errors for cross-correlation, phase and 
time lag between two light curves. It is assumed that there 
are two time series Xj and Y, of length N and for each data 
there are associated known measurement errors exj and eyj ■ 

Step 1: Calculate variances for the light curves. Compute 
the Fourier transforms of each light curve using 



JV-l 

Xfe = ^ Xj exp {2-Kijk/N) 



and similarly for Y^. Compute intrinsic variances, 

N/2-1 



(32) 



(33) 



where (Txe ~ Tf ^xj s-^d similarly for ayi- If either 

ax I < lAa^E = (M'^y\E or o\i < {4/^/N)alE, then 
no significant variation has been detected in one of the light 
curves and further analysis is not possible. In such cases, 
an upper limit of (4/\/7V)(T^g can be put for any intrinsic 
variation. If the there is significant variation detected, then 
the total error (both stochastic and measurement) on the 
variance is 



AaxT = 



2 

iV4 



JV/2-1 



of which the measurement uncertainty contributes 



JV/2-1 

(Aai^)2- A ^ (|X,|2-|Xm| = 

k = l 



(34) 



(35) 



where \Xm\^ ~ NaxE ''■nd is independent of k. 

Step 2: Compute the cross-Correlation. The non-normalised 
cross-correlation phasor is 



JV/2-1 

2 ~ ~* 

fc=i 

and its error is given by 

JV/2-1 
fc = -JV/2 



(36) 



(37) 



Check if \cxy\/ Acxy > 3. If not then no correlation is de- 
tected between the two light cur ves and t he upper limit on 
the cross-correlation is SAcxy/ crj^jayj- If the condition 
is satisfied (i.e. the correlation is detected) then the cross- 
correlation is 



\Cxy\ 



cxy\ 



(38) 



with error 

^A|Cxy|^2 _ ^A\CxYi\ 

\CxY 

where 



AlCxYil = 



Cxy)Acxy 



V^xi^ 



2 

YI 



Step 3: Compute the phase. The phase is given by 

. Im{cxY) 

sm(f> — — j j — 

\cxy\ 

whose error can be estimated to be 
AcxY 



CxY 



\Cxi 



(40) 



(41) 



(42) 



Step 4^ Compute the time delay between the light curves. 
Define 



Cxy{t') 



N/2-1 
k = l 



(43) 



and solve for <;/)(r) = to get an estimate of the time de- 
lay r. The error on r, At is to be estimated by consider- 
ing the range of r' for which 4>{t') ± A0(t') is consistent 
with zero. Compute the significance of the cross-correlation 
\cxy\/ AcxY at the two limits r' = r± At and consider the 
limits to be bona-fide if the significance is > 2, otherwise 
report that the particular limit on r cannot be obtained. 

Step 5: For multiple light curves or for a lightcurve divided in 
to segments find the weighted average of ayj and cxy, 
using their error estimates as weights. Then if the cross- 
correlation is significant, find the phase and time lags as in 
steps 3 and 4 above. 



6 APPLICATION TO AGN LIGHT CURVES 

To test and validate the effectiveness of the scheme, we 
analyse the lightcurve of a well studied Active Galactic Nu- 
cleus, Akn 564. The temporal and spectral properties of the 
source was st udied using an XMM-N ewton observation of 
the source by iDewangan et al.l l|2007l ). They computed the 
cross-correlation function for different energy bands and es- 
timated a possible time-lag between the hard and soft bands 
to be ~ 1768 sees using the peak of the function as a mea- 
sure. Using Monte Carlo simulations they estimated an error 
on the time la g to be ~ 1 00 sees due to m e asure ment error. 
lArevalo et"al] i20061) and iMcHardv et all l|2007l ') computed 
time-lags as function of frequency for ASCA and XMM- 
Newton observations of the source and found that the there 
is a sharp drop in time lag for frequencies greater than 10~* 
Hz. 
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Figure 8. The variability property of Akn 564. Lightcurves of 
the source in different energy bands were used for the analysis. 
The time-bin for the light curves is 64 seconds and the number 
of data points is 1463. The rms, cross-correlation (ICxyDi the 
phase difference {</)) ^^nd the time lag (r) are plotted with energy. 
The reference energy band is 0.2-0.3 keV. 



We extracted light curves of the source using the XMM- 
Newton observation, in differen t energy bands. Details of the 
extraction process are given in lDewangan et aP ()2007l '). The 
usable continuous time duration for the observation is for 
~ 10^ sees. Our motivation here is not to analyse in de- 
tail the temporal properties of the source and their physi- 
cal interpretation, but instead to show as an example and 
validate the method described in this work. Thus, while 
finer time binning of the data is possible, we restrict our 
analysis to 64 sec bins, which resulted in light curves with 
length A'^ = 1426. Figure [8] shows the results of our analysis. 
Note that the cross-correlation, phase and time-lag are well 
constrained as a function of energy. Figure [5] shows the re- 
sults of the analysis when the light curves were divided into 
ten segments and the results averaged as described in the 
last section. Note that again the physical quantities are well 
constrained and while the phase difference is relatively un- 
changed between the two analysis, the time-lag decreases by 
nearly an order of magnitude. This is consistent with earlier 
results that the time-lag decreases with increasing Fourier 
frequency. 

Splitting the lightcurve into segments and taking the 
average assumes that the during the time-scale, the source 
was stationary. This can be now explicitly tested using the 
analytical error estimates for each segment. This is demon- 
strated in Fig IIOI where for each ten segments, the r.m.s, 
cross-correlations and phase-lags (between the energy bands 
0.2-1 and 1-2 keV) are shown. The cross-correlations and 
phase lags are consistent with being a constant equal to 
the averaged value (shown as a dashed line). Formally the 
X^/dof for the data points to be constant are 5.3/9 and 
3.3/9 for the cross-correlation and phase lag respectively. 



Figure 9. Same as Figure |8] except that the light curves were 
divided into ten segments and the results averaged. 
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Figure 10. Checking the stationarity of the X-ray lightcurve of 
Akn 564. The complete lightcurve in the 1-2 keV band (shown 
in the top panel) has been divide into ten segments. For each 
segment the r.m.s is shown with errors in the second panel. The 
dashed line represents the average value. The cross-correlations, 
\Cxy\)) S'lid phase lags, between 0.2 — 1. and 1-2 keV bands for 
each segment are shown in the bottom two panels. The dashed 
lines represent the average values. It can seen that the cross- 
correlation and the phase lag are consistent with being a constant 
showing that the system is stationary in these time-scales. 
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The slightly lower value of than expected indicates only 
a slight overestimation of the error bars, especially for the 
phase lags. This is probably because for each segment, the 
error on the phase lag A(j) is large and hence the error dis- 
tribution maybe slightly different than a Gaussian. Never- 
theless, the figure clearly shows that not only can station- 
arity be tested but also reconfirms that the error estimates 
are reliable. Perhaps, not surprisingly, given the shape of 
the total light curve, the r.m.s of the 1-2 keV energy band 
is formally not consistent with being a constant, with a 
X^/dof = 28/10. This is primarily due to the fifth segment 
where the r.m.s and its error is small. Note that the error 
is computed by assuming that the power spectrum of the 
segment is representative of the ensemble average. Perhaps 
a more prudent approach would be to estimate the error on 
the r.m.s using the averaged power spectrum rather than for 
each individual segment. However, whether such deviations 
are a significant indication of departure from stationarity is 
arguable and subjective. Hence, we recommend that only 
large deviations (for e.g. x2/rfo/ > 5) should be taken as 
serious evidence for non-stationarity. 



7 SUMMARY AND DISCUSSION 

An analytical estimate for the significance and error on 
the cross-correlation, phase and time lag between two light 
curves is presented. The error estimates take into account 
the stochastic fluctuations of the lightcurve as well as any 
known measurement errors. The technique has been verified 
using simulations of light curves generated from both white 
and 1// stochastic processes with and without intrinsic cor- 
relation between them. The entire analysis consists of five 
algorithmic steps which are described in §6. The technique is 
ideally suited for short light curves of length ~ 1000 and is 
an improvement over earlier methods which were based on 
numerically expensive simulations or by dividing the data 
into number of segments to find the variance. 

The analytical estimate presented is based on several 
assumptions and hence is reliable only when they are valid. 
We emphasize this point by enumerating some of the main 
assumptions. 

• Both the light curves have been generated from stochas- 
tic processes. Technically, this means that the phase of the 
different Fourier components are unrelated to each other 
i.e. < XkXi >— Sik- This assumption will be violated if 
the generation mechanism is a non-linear one. In general, 
it is difficult to ascertain the degree of non-linearity in a 
short lightcurve and it requires sensitive analysis like Bi- 
coherence measure and/or non- linear time series analysis. 
Thus, in most cases, the stochastic nature of the light curves 
have to be assumed. It is prudent to be aware that this as- 
sumption has been made and its validity is unknown, like 
for example, for the prompt emission of Gamma-ray bursts. 
A simple case where this assumption will be violated is if 
the power spectra have dominant harmonic features, where 
the power in the harmonics is comparable to that of the 
primary. 

• The measurement errors are uncorrelated and have Gaus- 
sian distributions. The essential assumption is that the 
power spectrum of the measurement errors is independent 
of frequency (i.e. a white noise) and their phases are inde- 



pendent of each other. If the power spectrum has a differ- 
ent shape, then the appropriate changes have to be made 
and the basic results of this work need to be re-derived. 
For most practical purposes if the measurement errors are 
known, they usually are Gaussian distributions and hence 
this assumption is valid. If there are unknown systematic 
errors in the light curves then of course the analysis will 
not be applicable. Poisson distributions have the white noise 
property but in general the phases of the different Fourier 
components may be related. We have verified that for counts 
per time bin ~ 10, the results of this analysis is valid. For 
counts less than that, caution is advised. However, for such 
low counts, meaningful results can only be obtained for long 
time series and it may better to obtain frequency dependent 
coherence and lag measurements. 

• The light curves are evenly sampled without gaps. For un- 
evenly sampled data the cross-correlation can be estimated 
(|Edelson fc Kroliklll988D . but there does not seem to be an 
analytical way to estimate the significance and error. One 
needs to use either Monte Carlo simulations or more prac- 
tically estimate the error by dividing the light curves into 
several segments and finding their variance. 

• The light curves are stationary. As shown in the example of 
the light curves of Akn 564, this assumption can be tested by 
dividing the light curves into segments and checking whether 
the r.m.s, cross-correlation and phase lags are consistent to 
be a constant for different segments. 

While this technique is useful for short duration light 
curves, coherence and frequency dependent time lags pro- 
vide naturally more information and should be preferentially 
computed for long data streams. This technique may not 
be unique or optimal and hence there is a possibility and 
need for development of better methods provided they give 
robust and physically interpretable results. Finally, while 
cross-correlation, phase and time lags provide a quantita- 
tive measure of the system, their physical interpretation has 
to be done in terms of the physical geometrical and radiative 
model assumed for the system. 
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APPENDIX A: THE AVERAGE DEVIATION OF NON NORMALISED CROSS-CORRELATION 

We define the non normalised cross-correlation between two time-series as 

N/2-1 N/2-1 

'=^^ = i^ E -^^^'='-^2 E ^^^-^ (Ai) 

k=-N/2 k=-N/2 

Without loss of generality the two times-series may be split into coherent and incoherent components as 

-^k — ^k + ^k 

n = AX'k+Yr (A2) 

such that the ensemble averages < X^X"% >,< X^'^-k > and < X/^^Y"^ > are all zero. Then < cxy >= AJ2 < X^X^^. > 
and 

< >'= ^ E E < X ^-^-^ >= ^ E E < >< > (^3) 

where the implicit index of summation over k and m has been omitted for clarity. The average of the square, 

<CxY> = 7^ < E E XkY-kXmY-m > 

= E E < (^^')' X (^-)' > E < (^^')' > + E < (^^')' X > 



m 
1 

Now since < (X^)'' >= 2 < {X^)'^ the above expression can be simplified to 



7V4 ^ \'-ml - I 

Thus, 

JV/2-1 

iV4 



(Acxrf =< 4y > - < cxr >'= ^ < X > (A6) 

fc=-Ar/2 

If Xfe = (i.e. the two time-series are identical) then cxy = Ux and hence 

N/2-1 

(^^-)'=i^ E (A7) 

fc=-]V/2 

In the special case that the time-series is produced by a stochastic white noise (i.e. \Xk\'^ is a constant independent of k) then 
Aajc = (A8) 



